intra-class similarity
What Makes ImageNet Look Unlike LAION
ImageNet was famously created from Flickr image search results. What if we recreated ImageNet instead by searching the massive LAION dataset based on image captions alone? In this work, we carry out this counterfactual investigation. We find that the resulting ImageNet recreation, which we call LAIONet, looks distinctly unlike the original. Specifically, the intra-class similarity of images in the original ImageNet is dramatically higher than it is for LAIONet. Consequently, models trained on ImageNet perform significantly worse on LAIONet. We propose a rigorous explanation for the discrepancy in terms of a subtle, yet important, difference between two plausible causal data-generating processes for the respective datasets, which we support with systematic experimentation. In a nutshell, searching based on an image caption alone creates an information bottleneck that mitigates the selection bias otherwise present in image-based filtering. Our explanation formalizes a long-held intuition in the community that ImageNet images are stereotypical, unnatural, and overly simple representations of the class category. At the same time, it provides a simple and actionable takeaway for future dataset creation efforts.
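The intra-class similarity measure at the heart of this comparison can be made concrete. Below is a minimal sketch of one common way to compute it: the mean pairwise cosine similarity of image embeddings within a class (e.g., from a CLIP image encoder). The precomputed-embedding input and the toy data are illustrative assumptions; the paper's exact measure may differ.

```python
# Sketch: average pairwise cosine similarity within one class, computed on
# precomputed image embeddings. Assumes embeddings come from some image
# encoder (e.g., CLIP); the exact encoder and measure are assumptions here.
import numpy as np

def intra_class_similarity(embeddings: np.ndarray) -> float:
    """Mean pairwise cosine similarity among the embeddings of one class.

    embeddings: (n, d) array, one row per image in the class.
    """
    # L2-normalize rows so dot products equal cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T                 # (n, n) cosine similarity matrix
    n = len(embeddings)
    off_diag = sims[~np.eye(n, dtype=bool)]  # drop self-similarities
    return float(off_diag.mean())

# Toy usage: a tightly clustered class scores higher than a dispersed one.
rng = np.random.default_rng(0)
tight = rng.normal(loc=1.0, scale=0.05, size=(50, 512))
loose = rng.normal(loc=0.0, scale=1.0, size=(50, 512))
print(intra_class_similarity(tight))  # close to 1.0
print(intra_class_similarity(loose))  # close to 0.0
```

Under this measure, the paper's finding is that ImageNet classes behave like `tight` relative to their LAIONet counterparts.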
Improving Pseudo Labels With Intra-Class Similarity for Unsupervised Domain Adaptation
Unsupervised domain adaptation (UDA) transfers knowledge from a label-rich source domain to a different but related, fully unlabeled target domain. To address the problem of domain shift, a growing number of UDA methods adopt pseudo labels of the target samples to improve generalization on the target domain. However, inaccurate pseudo labels may yield suboptimal performance through error accumulation during optimization. Moreover, once the pseudo labels are generated, how to remedy them remains largely unexplored. In this paper, we propose a novel approach to improve the accuracy of the pseudo labels in the target domain. It first generates coarse pseudo labels with a conventional UDA method. Then, it iteratively exploits the intra-class similarity of the target samples to improve the coarse pseudo labels, and aligns the source and target domains with the improved pseudo labels. The pseudo labels are refined by first deleting dissimilar samples, and then using spanning trees to eliminate samples with wrong pseudo labels within each class. We have applied the proposed approach to several conventional UDA methods as an additional term. Experimental results demonstrate that the proposed method boosts the accuracy of the pseudo labels and leads to more discriminative and domain-invariant features than the conventional baselines.
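To make the refinement pipeline concrete, here is a hedged sketch of one plausible intra-class cleaning step in the spirit described above: per pseudo-class, drop samples far from the class centroid (deleting dissimilar samples), then cut the longest edges of a minimum spanning tree over the survivors and keep the largest connected component. The similarity threshold, distance metric, and number of cuts are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of one intra-class pseudo-label cleaning step. The centroid
# threshold, Euclidean MST, and single-edge cut are assumptions for
# illustration; the paper's actual criteria may differ.
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def clean_class(feats: np.ndarray, sim_thresh: float = 0.5, n_cuts: int = 1):
    """Return indices (into feats) of samples kept for one pseudo-class."""
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    centroid = normed.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    # Step 1: delete samples dissimilar to the class centroid.
    keep = np.where(normed @ centroid >= sim_thresh)[0]
    if len(keep) <= 2:
        return keep
    # Step 2: minimum spanning tree over pairwise distances of the survivors.
    dists = squareform(pdist(normed[keep], metric="euclidean"))
    mst = minimum_spanning_tree(dists).toarray()
    # Cut the n_cuts longest MST edges to detach outlying subtrees.
    edge_vals = np.sort(mst[mst > 0])
    if len(edge_vals) > n_cuts:
        mst[mst >= edge_vals[-n_cuts]] = 0
    _, labels = connected_components(mst, directed=False)
    # Keep only the largest remaining connected component.
    largest = np.bincount(labels).argmax()
    return keep[labels == largest]
```

Samples discarded by such a step would either be relabeled in a later iteration or excluded from the domain-alignment objective.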
Very Compact Clusters with Structural Regularization via Similarity and Connectivity
Clustering algorithms have improved significantly alongside deep neural networks, which provide effective representations of data. Existing methods are built upon a deep autoencoder and a self-training process that leverages the distribution of cluster assignments of samples. However, as the fundamental objective of the autoencoder is efficient data reconstruction, the learnt space may be sub-optimal for clustering. Moreover, such methods require highly effective codes (i.e., representations) of the data; otherwise, the initial cluster centers often cause stability issues during self-training. Many state-of-the-art clustering algorithms use convolution operations to extract efficient codes, but their applications are limited to image data. In this regard, we propose an end-to-end deep clustering algorithm, Very Compact Clusters (VCC), for general datasets, which takes advantage of the distributions of local relationships of samples near cluster boundaries, so that they can be properly separated and pulled toward cluster centers to form compact clusters. Experimental results on various datasets illustrate that our proposed approach achieves better clustering performance than most state-of-the-art clustering methods, and the data embeddings learned by VCC without convolution for image data are comparable even with specialized convolutional methods.
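The idea of pulling boundary samples toward cluster centers can be sketched with a generic self-training compactness objective. The snippet below uses the common DEC-style Student-t soft assignment and a sharpened target distribution; it is an illustrative stand-in under those assumptions, not VCC's actual loss.

```python
# Sketch of a generic cluster-compactness objective in the self-training
# style the abstract describes. This is the widely used DEC-style
# formulation, shown here as an assumed stand-in for VCC's loss.
import torch

def soft_assign(z: torch.Tensor, centers: torch.Tensor, alpha: float = 1.0):
    """Student-t soft assignment of embeddings z (n, d) to centers (k, d)."""
    d2 = torch.cdist(z, centers).pow(2)
    q = (1.0 + d2 / alpha).pow(-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)

def compactness_loss(q: torch.Tensor) -> torch.Tensor:
    """KL(P || Q) where P sharpens Q, pulling points toward their centers."""
    p = q.pow(2) / q.sum(dim=0)        # emphasize confident assignments
    p = p / p.sum(dim=1, keepdim=True)
    return (p * (p.log() - q.log())).sum(dim=1).mean()

# Toy usage: in practice z comes from an encoder and centers are learnable.
z = torch.randn(128, 32, requires_grad=True)
centers = torch.randn(5, 32, requires_grad=True)
loss = compactness_loss(soft_assign(z, centers))
loss.backward()  # gradients pull boundary points toward cluster centers
```

Minimizing this loss sharpens soft assignments, which is one standard way to realize the "compact clusters" behavior the abstract refers to.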